Stanislav Borysov [stabo@dtu.dk], DTU Management
In this tutorial, we will take a look at various ways to explain potential black-box machine learning models in a model-agnostic way. We will be working with a real-world Census Income dataset available in the UCI ML Repository, also known as the "Adult" dataset, where we will predict whether a person's income is more than $50K/yr or not.
This tutorial has several objectives. The first is to familiarize ourselves with the major state-of-the-art model interpretation frameworks out there (many of them being extensions of LIME, the original framework and approach proposed for model interpretation).
LIME (short for Local Interpretable Model-agnostic Explanations) is based on the work presented in the paper "Why Should I Trust You?": Explaining the Predictions of Any Classifier, which proposes a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction. It also covers a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way.
We will cover the usage of the following model interpretation frameworks and techniques in this tutorial.
Load necessary dependencies
import pandas as pd
import numpy as np
import random
import model_evaluation_utils as meu
import matplotlib.pyplot as plt
from collections import Counter
import warnings
warnings.filterwarnings('ignore')
plt.style.use('fivethirtyeight')
%matplotlib inline
RANDOM_SEED = 42 # set to None
np.random.seed(RANDOM_SEED)
random.seed(RANDOM_SEED)
Load the Census Income Dataset
def get_adult():
    df = pd.read_csv("adult.csv", dtype={
        'Workclass': 'category',
        'Marital Status': 'category',
        'Occupation': 'category',
        'Relationship': 'category',
        'Race': 'category',
        'Sex': 'category',
        'Country': 'category',
    })
    labels = df["label"].values
    data = df.loc[:, df.columns != 'label']
    return data, labels
data, labels = get_adult()
# alternatively, one can get the dataset from the shap package
#data, labels = shap.datasets.adult(display=True)
labels = np.array([int(label) for label in labels])
data.shape, labels.shape
Let's now take a look at our dataset attributes
| Attribute Name | Type | Description |
|---|---|---|
| Age | Continuous | Represents age of the person |
| Workclass | Categorical | Represents the nature of working class/category (Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked) |
| Education-Num | Categorical | Numeric representation of educational qualification. Ranges from 1-16. (Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool) |
| Marital Status | Categorical | Represents the marital status of the person (Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse) |
| Occupation | Categorical | Represents the type of profession/job of the person (Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces) |
| Relationship | Categorical | Represents the relationship status of the person (Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried) |
| Race | Categorical | Represents the race of the person (White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black) |
| Sex | Categorical | Represents the gender of the person (Female, Male) |
| Capital Gain | Continuous | The total capital gain for the person |
| Capital Loss | Continuous | The total capital loss for the person |
| Hours per week | Continuous | Total hours spent working per week |
| Country | Categorical | The country where the person is residing |
| Income Label (labels) | Categorical (class label) | The class label column is the one we want to predict (False: Income <= \$50K & True: Income > \$50K) |
We have a total of 12 features, and our objective is to predict whether the income of a person is more than \$50K (True) or not (False). Hence, we will be building and interpreting a classification model.
data.head()
data.info()
Here we convert the categorical columns with string values to numeric representations. Being a tree-based model, XGBoost can work directly with these ordinal codes, so we don't need to one-hot encode the features.
cat_cols = data.select_dtypes(['category']).columns
data[cat_cols] = data[cat_cols].apply(lambda x: x.cat.codes)
data.head()
Distribution of people with <= \$50K (False) and > \$50K (True) income
Counter(labels)
For any machine learning model, we always need train and test datasets. We will be building the model on the train dataset and test the performance on the test dataset. We maintain two datasets (one with the encoded categorical values and one with the original values) so we can train with the encoded dataset but use the original dataset as needed later on for model interpretation.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
data, labels, test_size=0.3, random_state=RANDOM_SEED
)
X_train.shape, X_test.shape
X_train.head()
train_ind, test_ind = X_train.index.values, X_test.index.values
data_disp, labels_disp = get_adult()
X_train_disp, X_test_disp = data_disp.loc[train_ind], data_disp.loc[test_ind]
y_train_disp, y_test_disp = labels_disp[train_ind], labels_disp[test_ind]
X_train_disp.shape, X_test_disp.shape
X_train_disp.head()
We will now train and build a basic boosting classification model on our training data using the popular XGBoost framework, an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable
From Wikipedia:
Gradient boosting is a machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion like other boosting methods do, and it generalizes them by allowing optimization of an arbitrary differentiable loss function.
Boosted trees are known to be one of the most powerful and widely used machine learning models besides neural networks. You can read more about the algorithm here.
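The stage-wise idea described above can be sketched in a few lines. This is a toy squared-loss booster with decision stumps, not XGBoost's actual algorithm: each weak learner is fit to the residuals (the negative gradient of the squared loss) of the current ensemble and added with a small learning rate.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)

# Stage-wise boosting for squared loss: each weak learner (a depth-1
# "stump") is fit to the residuals of the current ensemble, i.e. the
# negative gradient, then added with a small learning rate
pred = np.zeros_like(y)
for _ in range(100):
    residual = y - pred
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residual)
    pred += 0.1 * stump.predict(X)

print(round(float(np.mean((y - pred) ** 2)), 3))
```

After 100 stages, the ensemble of stumps approximates the sine curve far better than any single stump could.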
WARNING! THIS NOTEBOOK REQUIRES XGBOOST LIBRARY 0.81
pip install -U xgboost==0.81
To install xgboost on Mac, you need to install cmake first https://cmake.org/download/ and add CMake to the PATH (run in terminal):
PATH="/Applications/CMake.app/Contents/bin":"$PATH"
import xgboost as xgb
%%time
# measure execution time just for fun
xgc = xgb.XGBClassifier(n_estimators=500, max_depth=5, base_score=0.5,
objective='binary:logistic', random_state=RANDOM_SEED)
xgc.fit(X_train, y_train)
As in Part 1, we do not do any hyperparameter tuning. Please feel free to do it to improve the model.
Let's evaluate how our model has performed with its predictions on the test data.
We use model_evaluation_utils module for this which leverages scikit-learn internally to give us standard classification model evaluation metrics.
predictions = xgc.predict(X_test)
class_labels = list(set(labels))
meu.display_model_performance_metrics(
true_labels=y_test, predicted_labels=predictions, classes=class_labels
)
Out of the box, most machine learning models are difficult to interpret. Parametric models like logistic regression are easier, given that the total number of model parameters is fixed regardless of the volume of data, and one can interpret the model's prediction decisions by examining the parameter coefficients.
Non-parametric models are harder to interpret, given that the total number of parameters is unbounded and grows with the volume of data. Some non-parametric models, like tree-based models, do have out-of-the-box interpretation methods, such as feature importance, which help us understand which features might be influential in the model's prediction decisions.
Classic feature importances from XGBoost
Here we try out the global feature importance calculations that come with XGBoost. The model lets us view feature importances based on the following: feature weight (the number of times a feature is used to split the data across all trees), gain (the average gain of splits which use the feature), and coverage (the average number of samples affected by splits which use the feature).
Note that they all contradict each other, which motivates the use of model interpretation frameworks like SHAP, whose SHAP values claim to come with consistency guarantees (meaning they will typically order the features correctly).
fig = plt.figure(figsize = (16, 12))
title = fig.suptitle("Default Feature Importances from XGBoost", fontsize=14)
ax1 = fig.add_subplot(2,2, 1)
xgb.plot_importance(xgc, importance_type='weight', ax=ax1)
t=ax1.set_title("Feature Importance - Feature Weight")
ax2 = fig.add_subplot(2,2, 2)
xgb.plot_importance(xgc, importance_type='gain', ax=ax2)
t=ax2.set_title("Feature Importance - Split Mean Gain")
ax3 = fig.add_subplot(2,2, 3)
xgb.plot_importance(xgc, importance_type='cover', ax=ax3)
t=ax3.set_title("Feature Importance - Sample Coverage")
ELI5 is a Python package that helps to debug machine learning classifiers and explain their predictions in an easy-to-understand and intuitive way. It is perhaps the easiest of the three machine learning interpretation frameworks to get started with, since it involves minimal reading of documentation! However, it doesn't support true model-agnostic interpretations, with model support mostly limited to tree-based and other parametric/linear models. Let's look at some intuitive ways of model interpretation with ELI5 on our classification model.
Installation Instructions
We recommend installing this framework using
pip install eli5
since the conda version appears to be a bit outdated. Also, feel free to check out the documentation as needed.
import eli5
Typically for tree-based models, ELI5 does nothing special but uses the out-of-the-box feature importance computation methods which we discussed in the previous section. By default, 'gain' is used, that is the average gain of the feature when it is used in trees.
eli5.show_weights(xgc)
One of the best ways to explain model prediction decisions to either a technical or a more business-oriented individual is to examine individual data-point predictions. Typically, ELI5 does this by showing weights for each feature depicting how influential it might have been in contributing to the final prediction decision across all trees. The idea for weight calculation is described in http://blog.datadive.net/interpreting-random-forests/; ELI5 provides an independent implementation of this algorithm for XGBoost and most scikit-learn tree ensembles which is definitely on the path towards model-agnostic interpretation but not purely model-agnostic like LIME.
Typically, the prediction can be defined as the sum of the feature contributions + the “bias” (i.e. the mean given by the topmost region that covers the entire training set)
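The decision-path idea can be sketched on a tiny scikit-learn regression tree (a simplified stand-in for ELI5's actual implementation): walking from the root to a leaf, each change in the node mean is attributed to the feature used at that split, so the bias plus the contributions exactly reconstructs the prediction.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Tiny dataset: the target depends strongly on x0 and weakly on x1
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 10.0, 11.0])

tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
t = tree.tree_

def contributions(x):
    # Walk the decision path; attribute each change in the node mean
    # to the feature used at that split
    node, contrib = 0, np.zeros(X.shape[1])
    bias = t.value[0, 0, 0]              # mean target over the whole training set
    while t.children_left[node] != -1:   # -1 marks a leaf
        feat = t.feature[node]
        nxt = (t.children_left[node] if x[feat] <= t.threshold[node]
               else t.children_right[node])
        contrib[feat] += t.value[nxt, 0, 0] - t.value[node, 0, 0]
        node = nxt
    return bias, contrib

bias, contrib = contributions(np.array([1.0, 1.0]))
pred = tree.predict([[1.0, 1.0]])[0]
print(bias, contrib, pred)   # bias + sum of contributions == prediction
```

Here the bias is the training-set mean, and the larger contribution lands on the feature whose splits move the node mean the most.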
predictions[:10]
Here we can see the most influential features being the Age, Hours per week, Marital Status, Occupation & Relationship
doc_num = 0
print('Actual Label:', y_test[doc_num])
print('Predicted Label:', predictions[doc_num])
eli5.show_prediction(xgc.get_booster(), X_test.iloc[doc_num],
feature_names=list(data.columns), show_feature_values=True)
Here we can see the most influential features being the Education, Relationship, Occupation, Hours per week & Marital Status
doc_num = 2
print('Actual Label:', y_test[doc_num])
print('Predicted Label:', predictions[doc_num])
eli5.show_prediction(xgc.get_booster(), X_test.iloc[doc_num],
feature_names=list(data.columns) ,show_feature_values=True)
It is definitely interesting to see how similar features play an influential role in explaining model prediction decisions for both classes!
Skater is a unified framework to enable model interpretation for all forms of models, helping one build the interpretable machine learning systems often needed for real-world use-cases, using a model-agnostic approach. It is an open-source Python library designed to demystify the learned structures of a black box model both globally (inference on the basis of a complete data set) and locally (inference about an individual prediction).
<img src="global_vs_local.png" width="600"/>
Skater originally started off as a fork of LIME but then broke out as an independent framework of its own, with a wide variety of features and capabilities for model-agnostic interpretation of any black-box model. The project was started as a research idea to find ways to enable better interpretability (preferably human interpretability) of predictive "black boxes", both for researchers and practitioners.
Install Skater
You can typically install Skater using a simple
pip install skater
For detailed information on the dependencies and installation instruction check out installing skater.
📖 Documentation
We recommend you to check out the detailed documentation of Skater
| Section | Description |
|---|---|
| Overview | Introduction to the Skater library |
| Installing | How to install the Skater library |
| Tutorial | Steps to use Skater effectively |
| API Reference | The detailed reference for Skater's API |
| Contributing | Guide to contributing to the Skater project |
| Examples | Interactive notebook examples |
Algorithms
| Scope of Interpretation | Algorithms |
|---|---|
| Global Interpretation | [Model-agnostic Feature Importance](https://tinyurl.com/feature-importance) |
| Global Interpretation | [Model-agnostic Partial Dependence Plots](https://tinyurl.com/partial-dependence) |
| Local Interpretation | [Local Interpretable Model Explanation (LIME)](https://tinyurl.com/lime-explanation) |
| Local Interpretation (DNNs) | [Layer-wise Relevance Propagation (e-LRP)](https://tinyurl.com/e-layerwise): image; [Occlusion](https://tinyurl.com/dnn-occlusion): image; [Integrated Gradient](https://tinyurl.com/integrated-gradient): image and text |
| Global and Local Interpretation | [Scalable Bayesian Rule Lists](https://tinyurl.com/rule-list-sbr); [Tree Surrogates](https://tinyurl.com/treesurrogates) |
Usage and Examples
Since the project is under active development, the best way to understand usage would be to follow the examples mentioned in the Gallery of Interactive Notebook
A predictive model is a mapping from an input space to an output space. Interpretation algorithms are divided into those that offer statistics and metrics on regions of the domain, such as the marginal distribution of a feature, or the joint distribution of the entire training set. In an ideal world, there would exist some representation that would allow a human to interpret a decision function in any number of dimensions. Given that we generally can only intuit visualizations of a few dimensions at once, global interpretation algorithms either aggregate or subset the feature space.
Currently, the model-agnostic global interpretation algorithms supported by Skater include partial dependence and feature importance, with tree surrogates added in a very recent release. We will be covering feature importance and partial dependence plots here.
The general workflow within the skater package is to create an interpretation, create a model, and run interpretation algorithms. Typically, an Interpretation consumes a dataset, and optionally some metadata like feature names and row ids. Internally, the Interpretation will generate a DataManager to handle data requests and sampling.
Local Models(InMemoryModel): To create a skater model based on a local function or method, pass in the predict function to an InMemoryModel. A user can optionally pass data samples to the examples keyword argument. This is only used to infer output types and formats. Out of the box, skater allows models to return numpy arrays and pandas dataframes.
Operationalized Model(DeployedModel): If your model is accessible through an API, use a DeployedModel, which wraps the requests library. DeployedModels require two functions, an input formatter and an output formatter, which speak to the requests library for posting and parsing. The input formatter takes a pandas DataFrame or a numpy ndarray, and returns an object (such as a dict) that can be converted to JSON to be posted. The output formatter takes a requests.response as an input and returns a numpy ndarray or pandas DataFrame.
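The two formatters a DeployedModel expects can be sketched as below. Note this is an illustrative sketch only: the payload shape (`"instances"`) and the response key (`"predictions"`) are hypothetical, and for brevity the output formatter here takes already-parsed JSON rather than a raw requests.Response.

```python
import numpy as np
import pandas as pd

def input_formatter(X):
    # DataFrame/ndarray -> JSON-serializable request body
    arr = X.values if isinstance(X, pd.DataFrame) else np.asarray(X)
    return {"instances": arr.tolist()}

def output_formatter(response_json):
    # parsed API response -> ndarray of predictions
    return np.asarray(response_json["predictions"])

print(input_formatter(pd.DataFrame({"a": [1, 2]})))
```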
We will use the following workflow:
from skater.core.explanations import Interpretation
from skater.model import InMemoryModel
interpreter = Interpretation(training_data=X_test, training_labels=y_test,
feature_names=list(data.columns))
im_model = InMemoryModel(xgc.predict_proba, examples=X_train,
target_names=['$50K or less', 'More than $50K'])
Feature importance is a generic term for the degree to which a predictive model relies on a particular feature. The skater feature importance implementation is based on an information-theoretic criterion, measuring the entropy in the change of predictions, given a perturbation of a given feature. The intuition is that the more a model's decision criteria depend on a feature, the more we'll see predictions change as a function of perturbing a feature. The default method used is prediction-variance which is the mean absolute value of changes in predictions, given perturbations in the data.
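The perturbation intuition can be sketched with a simple shuffle-and-measure loop (this mirrors the idea, not Skater's exact implementation): shuffle one column at a time and record the mean absolute change in the model's predictions. The toy model below is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy model: depends strongly on feature 0, weakly on feature 1, not at all on 2
def predict(X):
    return 1.0 / (1.0 + np.exp(-(3.0 * X[:, 0] + 0.5 * X[:, 1])))

X = rng.normal(size=(500, 3))
base = predict(X)

importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])   # perturb one feature at a time
    importances.append(np.mean(np.abs(predict(Xp) - base)))

print(np.round(importances, 3))
```

Shuffling the unused feature changes nothing, so its importance is zero, while the importance ranking of the other two matches how strongly the model relies on them.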
plots = interpreter.feature_importance.plot_feature_importance(im_model,
ascending=True)
Partial Dependence describes the marginal impact of a feature on model prediction, holding other features in the model constant. The derivative of partial dependence describes the impact of a feature (analogous to a feature coefficient in a regression model). This has been adapted from T. Hastie, R. Tibshirani and J. Friedman, Elements of Statistical Learning Ed. 2, Springer. 2009.
The partial dependence plot (PDP or PD plot) shows the marginal effect of a feature on the predicted outcome of a previously fit model. PDPs can show if the relationship between the target and a feature is linear, monotonic or more complex. Skater can show 1-D as well as 2-D PDPs
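A 1-D partial dependence curve is easy to compute by hand (a minimal sketch, not Skater's implementation): for each grid value, fix the feature to that value for all rows and average the predictions, marginalizing over the other features. The linear toy model below is made up for illustration.

```python
import numpy as np

def partial_dependence(predict_fn, X, feature, grid):
    # For each grid value, set the feature to that value for ALL rows and
    # average the predictions (marginalizing over the other features)
    pd_vals = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pd_vals.append(predict_fn(Xv).mean())
    return np.array(pd_vals)

# Toy model that is linear (slope 2) in feature 0
predict_fn = lambda X: 2.0 * X[:, 0] + X[:, 1]
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
pd_curve = partial_dependence(predict_fn, X, 0, np.linspace(-1, 1, 5))
print(pd_curve)   # evenly spaced values: the slope of 2 is recovered
```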
PDP of 'Age' affecting model prediction
Looks like the middle-aged people have a slightly higher chance of making more money as compared to younger or older people
r = interpreter.partial_dependence.plot_partial_dependence(
['Age'], im_model, grid_resolution=50, grid_range=(0,1),
with_variance=True, figsize=(6, 4)
)
yl = r[0][1].set_ylim(0, 1)
PDP of 'Education Num' affecting model prediction
Looks like the higher the education level, the better the chance of making more money. Not surprising!
r = interpreter.partial_dependence.plot_partial_dependence(
['Education-Num'], im_model, grid_resolution=50,
grid_range=(0,1), with_variance=True, figsize=(6, 4)
)
yl = r[0][1].set_ylim(0, 1)
PDP of 'Capital Gain' affecting model prediction
Unsurprisingly, the higher the capital gain, the higher the chance of making more money; there is a steep rise around \$5K - \$8K.
r = interpreter.partial_dependence.plot_partial_dependence(
['Capital Gain'], im_model, grid_resolution=50,
grid_range=(0,1), with_variance=True, figsize=(8, 4)
)
yl = r[0][1].set_ylim(0, 1)
s, e = r[0][1].get_xlim()
xl = r[0][1].set_xticks(np.arange(s, e, 10000))
PDP of 'Relationship' affecting model prediction
Remember that relationship is coded as a categorical variable with numeric representations. Let's first look at how it is represented.
pd.concat([data_disp[['Relationship']], data[['Relationship']]], axis=1)\
.drop_duplicates()
It is definitely interesting that married folks (husband and wife) have a higher chance of making more money than others!
r = interpreter.partial_dependence.plot_partial_dependence(
['Relationship'], im_model, grid_resolution=50,
grid_range=(0,1), with_variance=True, figsize=(6, 4)
)
yl = r[0][1].set_ylim(0, 1)
Two-way PDP showing interactions between features 'Age' and 'Education-Num' and their effect on making more than \$50K
We run a deeper model interpretation here over all the data samples, trying to see interactions between Age and Education-Num and also their effect on the probability of the model predicting whether the person will make more money, with the help of a two-way partial dependence plot.
Interesting to see higher the education level and the middle-aged folks (30-50) having the highest chance of making more money!
plots_list = interpreter.partial_dependence.plot_partial_dependence(
[('Age', 'Education-Num')], im_model, grid_range=(0,1),
figsize=(12, 5), grid_resolution=50
)
Two-way PDP showing interactions between features 'Education-Num' and 'Capital Gain' and their effect on making more than \$50K
We run a deeper model interpretation here over all the data samples, trying to see interactions between Education-Num and Capital Gain and also their effect on the probability of the model predicting whether the person will make more money, with the help of a two-way partial dependence plot.
Basically having a better education and more capital gain leads to you making more money!
plots_list = interpreter.partial_dependence.plot_partial_dependence(
[('Education-Num', 'Capital Gain')], im_model, grid_range=(0,1),
figsize=(12, 5), grid_resolution=100
)
Local interpretation can possibly be achieved in two ways. Firstly, one could approximate the behavior of a complex predictive model in the vicinity of a single input using a simple, interpretable auxiliary or surrogate model (e.g. a linear regressor). Secondly, one could use the base estimator to understand the behavior of a single prediction using intuitive approximate functions based on inputs and outputs.
LIME is an algorithm designed by Marco Ribeiro, Sameer Singh and Carlos Guestrin to assess the behavior of any base estimator (model) using interpretable surrogate models (e.g. a linear classifier/regressor). Such a form of comprehensive evaluation helps in generating explanations which are locally faithful but may not align with the global behavior. Basically, LIME explanations are based on local surrogate models. These surrogate models are interpretable models (like a linear model or decision tree) that are learned on the predictions of the original black box model. But instead of trying to fit a global surrogate model, LIME focuses on fitting local surrogate models to explain why single predictions were made.
The idea is very intuitive. To start with, just try to unlearn what you have done so far! Forget about the training data, forget about how your model works! Think of your model as a black box with some magic happening inside, where you can input data points and get the model's predicted outcomes. You can probe this magic black box as often as you want with inputs and get output predictions.
Now, your main objective is to understand why the machine learning model, which you are treating as a magic black box, gave the outcome it produced. LIME tries to do this for you! It tests what happens to your black box model's predictions when you feed variations or perturbations of your dataset into it. Typically, LIME generates a new dataset consisting of perturbed samples and the associated black box model's predictions. On this dataset, LIME then trains an interpretable model weighted by the proximity of the sampled instances to the instance of interest. Following is a standard high-level workflow for this.
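The perturb, weight, fit-a-local-linear-model loop above can be sketched from scratch with NumPy. The black-box function, kernel width and perturbation scale here are all illustrative choices, not LIME's actual defaults.

```python
import numpy as np

rng = np.random.default_rng(42)

# A nonlinear black-box model we want to explain locally
def black_box(X):
    return np.sin(X[:, 0]) + X[:, 1] ** 2

x0 = np.array([0.0, 1.0])                      # instance to explain

# 1. Perturb around the instance and query the black box
Z = x0 + rng.normal(scale=0.5, size=(1000, 2))
y = black_box(Z)

# 2. Proximity weights: closer perturbations count more
d = np.linalg.norm(Z - x0, axis=1)
w = np.exp(-(d ** 2) / 0.5)

# 3. Weighted least squares -> interpretable local linear surrogate
A = np.hstack([Z, np.ones((len(Z), 1))])       # add an intercept column
sw = np.sqrt(w)[:, None]
coef, *_ = np.linalg.lstsq(A * sw, y * sw[:, 0], rcond=None)
print(coef[:2])   # local slopes, close to the true local gradient [cos(0), 2*1]
```

The surrogate's coefficients approximate the black box's local gradient at the instance, which is exactly the kind of "which feature pushed the prediction, and how hard" explanation LIME reports.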
We recommend you to read the LIME chapter in Christoph Molnar's excellent book on Model Interpretation which talks about this in detail.
Skater can leverage LIME to explain model predictions. Typically, its LimeTabularExplainer class helps in explaining predictions on tabular (i.e. matrix) data. For numerical features, it perturbs them by sampling from a Normal(0,1) and doing the inverse operation of mean-centering and scaling, according to the means and stds in the training data. For categorical features, it perturbs by sampling according to the training distribution, and making a binary feature that is 1 when the value is the same as the instance being explained. The explain_instance() function generates explanations for a prediction. First, we generate neighborhood data by randomly perturbing features from the instance. We then learn locally weighted linear (surrogate) models on this neighborhood data to explain each of the classes in an interpretable way.
Since XGBoost has some issues with feature name ordering when building models with dataframes, we will build our same model with numpy arrays to make LIME work without additional hassles of feature re-ordering. Remember the model being built is the same ensemble model which we treat as our black box machine learning model.
xgc_np = xgb.XGBClassifier(n_estimators=500, max_depth=5, base_score=0.5,
objective='binary:logistic', random_state=RANDOM_SEED)
xgc_np.fit(X_train.values, y_train)
from skater.core.local_interpretation.lime.lime_tabular import LimeTabularExplainer
exp = LimeTabularExplainer(X_test.values, feature_names=list(data.columns),
discretize_continuous=True,
class_names=['$50K or less', 'More than $50K'])
Predicting when a person's income <= \$50K
Skater gives a nice reasoning below showing which features were the most influential in the model taking the correct decision of predicting the person's income as below \$50K.
doc_num = 0
print('Actual Label:', y_test[doc_num])
print('Predicted Label:', predictions[doc_num])
exp.explain_instance(X_test.iloc[doc_num].values,
xgc_np.predict_proba).show_in_notebook()
Predicting when a person's income > \$50K
Skater gives a nice reasoning below showing which features were the most influential in the model taking the correct decision of predicting the person's income as above \$50K.
doc_num = 2
print('Actual Label:', y_test[doc_num])
print('Predicted Label:', predictions[doc_num])
exp.explain_instance(X_test.iloc[doc_num].values,
xgc_np.predict_proba).show_in_notebook()
We have seen various ways to interpret machine learning models with feature importances, dependence plots and even LIME. But can we build a more interpretable approximation, or surrogate model, of a really complex black box model like our XGBoost model with its hundreds of decision trees?
Herein, we introduce the novel idea of using TreeSurrogates as a means of explaining a model's learned decision policies (for inductive learning tasks), which is inspired by the work of Mark W. Craven, described as the TREPAN algorithm.
We recommend checking out the following excellent papers on the TREPAN algorithm to build surrogate trees.
Briefly, Trepan constructs a decision tree in a best-first manner. It maintains a queue of leaves which are expanded into subtrees as they are removed from the queue. With each node in the queue, Trepan stores: a subset of the training examples, a set of query instances, and a set of constraints.
The stored subset of training examples consists simply of those examples that reach the node. The query instances are used, along with the training examples, to select the splitting test if the node is an internal node or to determine the class label if it is a leaf. The constraint set describes the conditions that instances must satisfy in order to reach the node; this information is used when drawing a set of query instances for a newly created node. The process of expanding a node in Trepan is much like it is in conventional decision tree algorithms: a splitting test is selected for the node, and a child is created for each outcome of the test. Each child is either made a leaf of the tree or put into the queue for future expansion.
For Skater's implementation of explainable surrogate models, the base estimator (Oracle) could be any form of supervised learning predictive model, which is our black box model. The explanations are approximated using Decision Trees (for both classification and regression) by learning decision boundaries similar to those learned by the Oracle (predictions from the base model are used for learning the Decision Tree representation). The implementation also generates a fidelity score to quantify the tree-based surrogate model's approximation of the Oracle. Ideally, the score should be 0 for a truthful explanation, both globally and locally. Let's check this out in action!
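The core oracle/surrogate idea can be sketched with scikit-learn: train a complex "oracle", then fit a shallow tree to the oracle's predictions rather than the true labels. Note this sketch reports fidelity as an agreement rate (1.0 is a perfect match), whereas Skater's fidelity score is error-style (0 is a perfect match); the toy data is made up for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + X[:, 1] * X[:, 2] > 0).astype(int)

# "Oracle": a complex black-box model
oracle = GradientBoostingClassifier(random_state=42).fit(X, y)

# Surrogate: a shallow tree trained on the ORACLE's predictions,
# not on the true labels
surrogate = DecisionTreeClassifier(max_depth=3, random_state=42)
surrogate.fit(X, oracle.predict(X))

# Agreement rate between surrogate and oracle
fidelity = float((surrogate.predict(X) == oracle.predict(X)).mean())
print(round(fidelity, 3))
```

A high agreement rate means the shallow, human-readable tree is a faithful stand-in for the black box on this data.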
NOTE: This implementation is currently experimental and might change in the future.
Using the interpreter instance invoke call to the TreeSurrogate
surrogate_explainer = interpreter.tree_surrogate(oracle=im_model, seed=RANDOM_SEED)
Using the surrogate model to learn the decision boundaries learned by the base estimator
surrogate_explainer.fit(X_train, y_train,
use_oracle=True, prune='pre', scorer_type='f1')
Taking a look at the position for each feature
pd.DataFrame([('X'+str(idx), feature)
for (idx, feature) in enumerate(data.columns)]).T
Visualizing the Surrogate Tree
from skater.util.dataops import show_in_notebook
from graphviz import Source
from IPython.display import SVG
graph = Source(
surrogate_explainer.plot_global_decisions(
colors=['coral', 'darkturquoise'],
file_name='test_tree_pre.png'
).to_string()
)
svg_data = graph.pipe(format='svg')
with open('dtree_structure.svg','wb') as f:
f.write(svg_data)
SVG(svg_data)
Interesting rules from the surrogate tree
Here are some interesting rules you can observe from the above tree
- Relationship < 0.5 (i.e., 0) and Education-Num <= 9.5 and Capital Gain <= 4225 → 70% chance of the person making <= \$50K
- Relationship < 0.5 (i.e., 0) and Education-Num <= 9.5 and Capital Gain >= 4225 → 94.5% chance of the person making > \$50K
- Relationship < 0.5 (i.e., 0) and Education-Num >= 9.5 and Education-Num >= 12.5 → 94.7% chance of the person making > \$50K

Feel free to derive more interesting rules from this and also from your own models! Let's look at how our surrogate model performs on the test dataset now.
Surrogate Model Performance Evaluation
Just as expected, the model performance drops a fair bit, but we still get an overall F1-score of 83% as compared to our boosted model's score of 87%, which is quite good!
surrogate_predictions = surrogate_explainer.predict(X_test)
class_labels = list(set(labels))
meu.display_model_performance_metrics(
true_labels=y_test, predicted_labels=surrogate_predictions, classes=class_labels
)
SHAP (SHapley Additive exPlanations) is a unified approach to explain the output of any machine learning model. SHAP connects game theory with local explanations, uniting several previous methods and, based on the authors' claims, representing the only possible consistent and locally accurate additive feature attribution method (do check out the SHAP NIPS paper for details).
Install
SHAP can be installed from PyPI
pip install shap
or conda-forge
conda install -c conda-forge shap
A really awesome aspect of this framework: while SHAP values can explain the output of any machine learning model, for really complex ensemble models this can be slow. However, the authors have developed a high-speed exact algorithm for tree ensembles (see the Tree SHAP arXiv paper). Fast C++ implementations are supported for XGBoost, LightGBM, CatBoost, and scikit-learn tree models!
SHAP (SHapley Additive exPlanations) assigns each feature an importance value for a particular prediction. Its novel components include: the identification of a new class of additive feature importance measures, and theoretical results showing there is a unique solution in this class with a set of desirable properties. Typically, SHAP values try to explain the output of a model (function) as a sum of the effects of each feature being introduced into a conditional expectation. Importantly, for non-linear functions the order in which features are introduced matters. The SHAP values result from averaging over all possible orderings. Proofs from game theory show this is the only possible consistent approach.
An intuitive way to understand the Shapley value is the following: the feature values enter a room in random order. All feature values in the room participate in the game (= contribute to the prediction). The Shapley value $\phi_{ij}$ is the average, over all orderings, of the marginal contribution of feature value $x_{ij}$ when it joins whatever feature values have already entered the room before it, i.e.
$$\phi_{ij}=\frac{1}{\#\text{orderings}}\sum_{\text{all orderings}}\left[val\left(\{\text{features before } j\}\cup\{x_{ij}\}\right)-val\left(\{\text{features before } j\}\right)\right]$$
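To make the "room" intuition concrete, here is a brute-force Shapley computation for a toy two-feature model. The value function `val` and the feature values are purely hypothetical, and real SHAP implementations avoid this factorial enumeration:

```python
from itertools import permutations

def shapley_values(features, val):
    """Brute-force Shapley values: average each feature's marginal
    contribution over all orderings in which features enter the room."""
    names = list(features)
    phi = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        present = {}
        for name in order:
            before = val(present)               # value before this feature joins
            present[name] = features[name]
            phi[name] += val(present) - before  # marginal contribution
    return {name: phi[name] / len(orderings) for name in names}

# Toy linear model; features missing from the coalition fall back to a
# baseline of 0, so val({}) = 0 (hypothetical value function).
def val(coalition):
    return 2.0 * coalition.get('Age', 0.0) + 3.0 * coalition.get('Capital Gain', 0.0)

phi = shapley_values({'Age': 1.0, 'Capital Gain': 2.0}, val)
print(phi)  # {'Age': 2.0, 'Capital Gain': 6.0}
```

For a linear model, the Shapley values recover each feature's exact contribution, and they always sum to the model output minus the baseline.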
The following figure from the KDD 18 paper, Consistent Individualized Feature Attribution for Tree Ensembles summarizes this in a nice way!

Let's now dive into SHAP and leverage it for interpreting our model!
import shap
Here we use the Tree SHAP implementation integrated into XGBoost to explain the test dataset! Remember that there are a variety of explainer methods based on the type of models you are building. We estimate the SHAP values for a set of samples (test data)
explainer = shap.TreeExplainer(xgc)
shap_values = explainer.shap_values(X_test)
pd.DataFrame(shap_values).head()
This returns a matrix of SHAP values (# samples x # features). Each row sums to the difference between the model output for that sample and the expected value of the model output (which is stored as the expected_value attribute of the explainer). Typically this difference helps us explain why the model is inclined toward predicting a specific class outcome.
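This additivity property is easy to verify numerically. A small sketch with a hypothetical 3x2 SHAP matrix (with the real XGBoost model you would compare against the raw margin predictions, since Tree SHAP values for XGBoost live in log-odds space):

```python
import numpy as np

# Hypothetical SHAP matrix: 3 samples x 2 features, plus a base value
shap_vals = np.array([[ 0.5, -0.2],
                      [-0.3,  0.1],
                      [ 0.8,  0.4]])
expected_value = -1.0

# Each row of SHAP values sums to (model output - expected value),
# so the model output can be reconstructed exactly:
model_output = expected_value + shap_vals.sum(axis=1)
print(model_output)  # [-0.7 -1.2  0.2]
```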
print('Expected Value:', explainer.expected_value)
Predicting when a person's income <= \$50K
SHAP gives nice reasoning below, showing which features were the most influential in the model correctly predicting the person's income as below \$50K. The explanation shows features each contributing to push the model output from the base value (the average model output over the training dataset we passed) to the actual model output. Features pushing the prediction higher are shown in red; those pushing it lower are in blue.
shap.force_plot(explainer.expected_value, shap_values[0,:], X_test_disp.iloc[0,:],
matplotlib=True)
Predicting when a person's income > \$50K
Similarly, SHAP gives nice reasoning below, showing which features were the most influential in the model correctly predicting the person's income as greater than \$50K.
shap.force_plot(explainer.expected_value, shap_values[2,:], X_test_disp.iloc[2,:],
matplotlib=True)
One of the key advantages of SHAP is it can build beautiful interactive plots that can visualize and explain multiple predictions at once. Here we visualize model prediction decisions for the first 1000 test data samples.
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values[:1000,:],
X_test_disp.iloc[:1000,:],
#matplotlib=True
)
Warning! The interactive plot currently throws an error in JupyterLab, while Jupyter Notebook seems to work. Just in case, the static picture from the blog is shown below.

The above visualization can be interacted with in multiple ways. The default visualization shows some interesting model prediction pattern decisions.
It is definitely interesting how we can discover patterns that lead the model to make specific decisions and provide explanations for them.
For a global view of feature importance, we can take the average of the SHAP value magnitudes across the dataset and plot it as a simple bar chart.
shap.summary_plot(shap_values, X_test, plot_type="bar")
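Under the hood, this bar chart is nothing more than the mean absolute SHAP value per feature. A quick sketch with a hypothetical SHAP matrix:

```python
import numpy as np

# Hypothetical 4-sample x 3-feature SHAP matrix
shap_vals = np.array([[ 0.5, -0.2,  0.0],
                      [-0.3,  0.1,  0.2],
                      [ 0.8, -0.4,  0.1],
                      [-0.4,  0.3,  0.1]])

# Global importance = mean |SHAP value| over all samples, per feature
mean_abs = np.abs(shap_vals).mean(axis=0)
print(mean_abs)  # one value per feature, matching the bar heights
```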
Besides a typical feature importance bar chart, SHAP also enables us to use a density scatter plot of SHAP values for each feature to identify how much impact each feature has on the model output for individuals in the validation dataset. Features are sorted by the sum of the SHAP value magnitudes across all samples. It is interesting to note that the age and marital status features have more total model impact than the capital gain feature, but for those samples where capital gain matters it has more impact than age or marital status. In other words, capital gain affects a few predictions by a large amount, while age or marital status affects all predictions by a smaller amount.
Note that when the scatter points don't fit on a line they pile up to show density, and the color of each point represents the feature value of that individual.
shap.summary_plot(shap_values, X_test)
SHAP dependence plots show the effect of a single (or two) feature across the whole dataset. They plot a feature's value vs. the SHAP value of that feature across many samples. SHAP dependence plots are similar to partial dependence plots, but account for the interaction effects present in the features, and are only defined in regions of the input space supported by data. The vertical dispersion of SHAP values at a single feature value is driven by interaction effects, and another feature can be chosen for coloring to highlight possible interactions.
You will also notice its similarity with Skater's Partial Dependence Plots!
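At its core, a dependence plot is just a scatter of a feature's value against that feature's SHAP value across samples. A minimal do-it-yourself sketch with synthetic data (the inverted-V shape mimicking the Age effect is made up for illustration):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend for this sketch
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
age = rng.integers(18, 70, size=200)
# Synthetic SHAP values peaking for middle-aged people, plus noise
shap_age = 0.02 * (27 - np.abs(age - 45)) + rng.normal(0, 0.05, size=200)

plt.scatter(age, shap_age, s=10)
plt.xlabel('Age')
plt.ylabel('SHAP value for Age')
plt.savefig('age_dependence_sketch.png')
```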
PDP of 'Age' affecting model prediction
Just like we observed before, the middle-aged people have a slightly higher SHAP value, pushing the model's prediction decisions to say that these individuals make more money as compared to younger or older people.
shap.dependence_plot(ind='Age', interaction_index='Age',
shap_values=shap_values,
features=X_test,
display_features=X_test_disp)
PDP of 'Education-Num' affecting model prediction
Higher education levels have higher SHAP values, pushing the model's prediction decisions to say that these individuals make more money as compared to people with lower education levels.
shap.dependence_plot(ind='Education-Num', interaction_index='Education-Num',
shap_values=shap_values,
features=X_test,
display_features=X_test_disp)
PDP of 'Relationship' affecting model prediction
Just like we observed during the model prediction explanations, married people (husband or wife) have a slightly higher SHAP value, pushing the model's prediction decisions to say that these individuals make more money as compared to other folks!
shap.dependence_plot(ind='Relationship', interaction_index='Relationship',
shap_values=shap_values,
features=X_test,
display_features=X_test_disp)
PDP of 'Capital Gain' affecting model prediction
You might have observed a very similar plot in the Skater PDPs: here, a Capital Gain of more than roughly \$5K-\$8K leads to a huge spike in the SHAP values, pushing the model's prediction decisions to say that these individuals make more money as compared to others!
shap.dependence_plot(ind='Capital Gain', interaction_index='Capital Gain',
shap_values=shap_values,
features=X_test,
display_features=X_test_disp)
Two-way PDP showing interactions between features 'Age' and 'Capital Gain' and their effect on making more than \$50K
The vertical dispersion of SHAP values at a single feature value is driven by interaction effects, and another feature is chosen for coloring to highlight possible interactions. Here we look at interactions between Age and Capital Gain and their effect on the SHAP values, which lead the model to predict whether the person will make more money or not, with the help of a two-way partial dependence plot.
It is interesting to see that middle-aged folks (30-50) with higher capital gains have the highest chance of making more money!
shap.dependence_plot(ind='Age', interaction_index='Capital Gain',
shap_values=shap_values, features=X_test,
display_features=X_test_disp)
Two-way PDP showing interactions between features 'Education-Num' and 'Relationship' and their effect on making more than \$50K
Here we look at interactions between Education-Num and Relationship and their effect on the SHAP values, which lead the model to predict whether the person will make more money or not, with the help of a two-way partial dependence plot.
It is interesting to see that people with higher education levels who are married (husband or wife) have the highest chance of making more money!
shap.dependence_plot(ind='Education-Num', interaction_index='Relationship',
shap_values=shap_values, features=X_test,
display_features=X_test_disp)
Two-way PDP showing interactions between features 'Marital Status' and 'Relationship' and their effect on making more than \$50K
Here we look at interactions between Marital Status and Relationship and their effect on the SHAP values, which lead the model to predict whether the person will make more money or not, with the help of a two-way partial dependence plot.
This is interesting because both features capture similar information; we can see that married people with a relationship status of either husband or wife typically have the highest chance of making more money!
shap.dependence_plot(ind='Marital Status', interaction_index='Relationship',
shap_values=shap_values, features=X_test,
display_features=X_test_disp)
Two-way PDP showing interactions between features 'Age' and 'Hours per week' and their effect on making more than \$50K
Here we look at interactions between Age and Hours per week and their effect on the SHAP values, which lead the model to predict whether the person will make more money or not, with the help of a two-way partial dependence plot.
Nothing extraordinary here: middle-aged people who work the most make the most money!
shap.dependence_plot(ind='Age', interaction_index='Hours per week',
shap_values=shap_values, features=X_test,
display_features=X_test_disp)